Looking at alternatives within the framework of n-gram based language modeling for spontaneous speech recognition

Authors

  • Luc Lussier
  • Edward W. D. Whittaker
  • Sadaoki Furui
Abstract

This paper presents different methods that use a weighted mixture of word and word-class language models to perform language model adaptation. A general language model is built from the whole training corpus; clusterings with several different numbers of clusters are then created according to a word co-occurrence measure; finally, word models as well as word-class models are built from each cluster. The general language model is then combined with one or several other models chosen according to a minimum perplexity criterion. Results show an absolute reduction in word error rate of 1.40% and 0.49% on average for two different test sets of the “Corpus of Spontaneous Japanese.”

Keywords: Spontaneous speech recognition, language model adaptation, document clustering, word models, word-class models, EM algorithm
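As a rough illustration of the weighted-mixture idea in the abstract, the following minimal sketch estimates linear interpolation weights with EM and evaluates the perplexity of the resulting mixture. It is not the authors' implementation: the function names, the `probs` input layout (one row per held-out token, one column per component model), and the fixed iteration count are all assumptions made for this example, and the per-token probabilities are assumed to be strictly positive and already computed by the component models.

```python
import math

def em_mixture_weights(probs, iterations=50):
    """Estimate interpolation weights for a linear mixture of language models.

    probs[i][k] is the probability that component model k assigns to
    held-out token i (hypothetical layout, assumed strictly positive).
    """
    n_models = len(probs[0])
    weights = [1.0 / n_models] * n_models  # start from a uniform mixture
    for _ in range(iterations):
        expected = [0.0] * n_models
        for row in probs:
            mix = sum(w * p for w, p in zip(weights, row))
            # E-step: posterior responsibility of each model for this token
            for k, (w, p) in enumerate(zip(weights, row)):
                expected[k] += w * p / mix
        # M-step: new weights are the average responsibilities
        weights = [e / len(probs) for e in expected]
    return weights

def perplexity(probs, weights):
    """Perplexity of the interpolated model on the same held-out tokens."""
    log_sum = sum(
        math.log(sum(w * p for w, p in zip(weights, row))) for row in probs
    )
    return math.exp(-log_sum / len(probs))

if __name__ == "__main__":
    # Toy per-token probabilities for a general model and one cluster model.
    held_out = [[0.1, 0.4], [0.2, 0.3], [0.05, 0.5], [0.3, 0.1]]
    w = em_mixture_weights(held_out)
    print("weights:", w)
    print("perplexity:", perplexity(held_out, w))
```

Under the same assumptions, a minimum perplexity selection could be approximated by running this procedure for each candidate cluster model and keeping the combination whose mixture perplexity on held-out data is lowest.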

Similar resources

Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech

Language modeling, especially for spontaneous speech, often suffers from a mismatch of utterance segmentations between training and test conditions. In particular, training often uses linguistically-based segments, whereas testing occurs on acoustically determined segments, resulting in degraded performance. We present an N-best rescoring algorithm that removes the effect of segmentation mismat...

Unified language modeling using finite-state transducers with first applications

In this paper, we investigate a weighted finite-state transducer approach to language modelling for speech recognition applications. We explore a unified framework to conversational speech recognition which combines the benefits of grammars, n-gram and class-based language models, with the flexibility of using dynamic data, and the potential for integrating semantics. Based on a virtual persona...

Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition

While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a three-word window. This paper proposes a new language modeling approach to capture the preferred re...

Transforming out-of-domain estimates to improve in-domain language models

Standard statistical language modeling techniques suffer from sparse-data problems when applied to real tasks in speech recognition, where large amounts of domain-dependent text are not available. In this work, we introduce a modified representation of the standard word n-gram model using part-of-speech (POS) labels that compensates for word and POS usage differences across domains. Two different ...

Mandarin Pronunciation Modeling Based on CASS Corpus

Pronunciation variability is an important issue that must be faced when developing practical automatic spontaneous speech recognition systems. In this paper, the factors that may affect the recognition performance are analyzed, including those specific to the Chinese language. By studying the INITIAL/FINAL (IF) characteristics of Chinese language and developing the Bayesian equation, w...

Publication date: 2004